Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier

Authors

  • Pedro Domingos
  • Michael Pazzani
Abstract

The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences. No explanation for this has been proposed so far. In this paper we show that the SBC does not in fact assume attribute independence, and can be optimal even when this assumption is violated by a wide margin. The key to this finding lies in the distinction between classification and probability estimation: correct classification can be achieved even when the probability estimates used contain large errors. We show that the previously-assumed region of optimality of the SBC is a second-order infinitesimal fraction of the actual one. This is followed by the derivation of several necessary and several sufficient conditions for the optimality of the SBC. For example, the SBC is optimal for learning arbitrary conjunctions and disjunctions, even though they violate the independence assumption. The paper also reports empirical evidence of the SBC's competitive performance in domains containing substantial degrees of attribute dependence.
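The abstract's example of a conjunction can be illustrated concretely. The sketch below (not from the paper's own experiments; all names are illustrative) fits a naive Bayes model by maximum likelihood on the four examples of y = x1 AND x2. The attributes are clearly dependent given the class, yet the classifier's argmax decision recovers the conjunction exactly, showing how correct classification survives inaccurate probability estimates:

```python
# Illustrative sketch: a simple Bayesian (naive Bayes) classifier learning
# the conjunction y = x1 AND x2, whose attributes violate class-conditional
# independence, yet whose classifications remain correct.
from collections import defaultdict

def fit_naive_bayes(X, y):
    """Return class priors and per-attribute conditional MLE estimates."""
    n = len(y)
    priors = defaultdict(float)          # class -> count (then probability)
    counts = defaultdict(int)            # (class, attr index, value) -> count
    cond = defaultdict(float)            # (class, attr index, value) -> P(v|c)
    for xi, yi in zip(X, y):
        priors[yi] += 1
        for j, v in enumerate(xi):
            counts[(yi, j, v)] += 1
    for (c, j, v), k in counts.items():
        cond[(c, j, v)] = k / priors[c]  # priors still holds raw class counts
    for c in priors:
        priors[c] /= n
    return priors, cond

def predict(priors, cond, x):
    def score(c):
        s = priors[c]
        for j, v in enumerate(x):
            s *= cond[(c, j, v)]         # unseen (class, attr, value) -> 0.0
        return s
    return max(priors, key=score)

# All four examples of the conjunction x1 AND x2
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
priors, cond = fit_naive_bayes(X, y)
preds = [predict(priors, cond, x) for x in X]
print(preds)  # [0, 0, 0, 1] -- matches y on every example
```

The product of independent per-attribute estimates is a poor estimate of the true class posterior here, but the ranking of the two classes is still right at every point, which is all classification requires.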


Similar articles

Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier

The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences. No explanation for this has been proposed so far. In this paper we show that the SBC does not in fact assume attribute independence, and ca...


Visualizing the Simple Bayesian Classifier

The simple Bayesian classifier (SBC), sometimes called Naive-Bayes, is built based on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classification models even when there are clear conditional dependencies. The SBC can serve as an excellent tool fo...


Searching for Dependencies in Bayesian Classifiers

Naive Bayesian classifiers which make independence assumptions perform remarkably well on some data sets but poorly on others. We explore ways to improve the Bayesian classifier by searching for dependencies among attributes. We propose and evaluate two algorithms for detecting dependencies among attributes and show that the backward sequential elimination and joining algorithm provides the most ...


An Effective Bayesian Neural Network Classifier with a Comparison Study to Support Vector Machine

We propose a new Bayesian neural network classifier, different from that commonly used in several respects, including the likelihood function, prior specification, and network structure. Under regularity conditions, we show that the decision boundary determined by the new classifier will converge to the true one. We also propose a systematic implementation for the new classifier. In our implementat...


Improving Simple Bayes

The simple Bayesian classifier (SBC), sometimes called Naive-Bayes, is built based on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classification models even when there are clear conditional dependencies. We examine different approaches for handli...



Journal title:

Volume   Issue

Pages  -

Publication date: 1996